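Recurrent neural networks (RNNs) are powerful tools for sequential modeling, but typically require significant overparameterization and regularization to achieve optimal performance. This makes large RNNs difficult to deploy in resource-limited settings and also complicates hyperparameter selection and training. To address these issues, we introduce a "fully tensorized" RNN architecture that jointly encodes the separate weight matrices within each recurrent cell using a lightweight tensor-train (TT) factorization. This approach represents a novel form of weight sharing that reduces model size by several orders of magnitude while maintaining similar or better performance compared to standard RNNs. Experiments on image classification and speaker verification tasks demonstrate further benefits in reduced inference time and more stable model training and hyperparameter selection.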
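As a rough illustration of why a tensor-train factorization shrinks a weight matrix, the sketch below counts parameters for a dense matrix and for a TT-matrix of the same shape. It is not the architecture from the abstract, which jointly encodes several weight matrices per cell. The mode sizes and TT ranks are hypothetical values chosen only for this example.

```python
from math import prod


def dense_params(in_modes, out_modes):
    """Entries in a dense matrix of shape prod(in_modes) x prod(out_modes)."""
    return prod(in_modes) * prod(out_modes)


def tt_params(in_modes, out_modes, ranks):
    """Entries in the TT cores representing the same matrix.

    Core k has shape (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]),
    and the boundary ranks must both equal 1.
    """
    if len(in_modes) != len(out_modes) or len(ranks) != len(in_modes) + 1:
        raise ValueError("need one (in, out) mode pair per core and len(ranks) = cores + 1")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical sizes: a 1024 x 1024 matrix viewed as (4*8*8*4) x (4*8*8*4).
in_modes = (4, 8, 8, 4)
out_modes = (4, 8, 8, 4)
ranks = (1, 4, 4, 4, 1)  # assumed TT ranks, not taken from the paper

n_dense = dense_params(in_modes, out_modes)
n_tt = tt_params(in_modes, out_modes, ranks)
print(n_dense, n_tt)  # 1048576 2176
```

With these assumed sizes the TT cores hold roughly 480 times fewer parameters than the dense matrix. The actual saving depends on the chosen mode sizes and ranks.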
Regularising the parameter matrices of neural networks is ubiquitous in training deep models. Typical regularisation approaches suggest initialising weights with small random values and penalising weights to promote sparsity. However, these widely used techniques may be less effective in certain scenarios. Here, we study the Koopman autoencoder model, which includes an encoder, a Koopman operator layer, and a decoder. These models have been designed to tackle physics-related problems with interpretable dynamics and an ability to incorporate physics-related constraints. However, the majority of existing work employs standard regularisation practices. In our work, we take a step toward augmenting Koopman autoencoders with initialisation and penalty schemes tailored for physics-related settings. Specifically, we propose the "eigeninit" initialisation scheme that samples initial Koopman operators from specific eigenvalue distributions. In addition, we suggest the "eigenloss" penalty scheme that penalises the eigenvalues of the Koopman operator during training. We demonstrate the utility of these schemes on two synthetic datasets, a driven pendulum and flow past a cylinder, and on two real-world problems, ocean surface temperatures and cyclone wind fields. We find on these datasets that eigenloss and eigeninit improve the convergence rate by up to a factor of 5, and that they reduce the cumulative long-term prediction error by up to a factor of 3. Such a finding points to the utility of incorporating similar schemes as an inductive bias in other physics-related deep learning approaches.
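The two schemes can be illustrated with a minimal NumPy sketch. The abstract does not specify the eigenvalue distribution or the penalty form, so everything here is an assumption made for illustration: eigenvalues are drawn as complex-conjugate pairs with moduli in a hypothetical range just inside the unit circle, and the penalty is the squared excess of each eigenvalue modulus over 1. The function names mirror the scheme names but are not the authors' code.

```python
import numpy as np


def eigeninit(n, rng, r_low=0.9, r_high=1.0):
    """Sample a real n x n matrix with eigenvalue moduli in [r_low, r_high].

    Assumption: eigenvalues come in conjugate pairs r * exp(+/- i * theta),
    with r and theta drawn uniformly. The paper's distributions may differ.
    """
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even dimension")
    radii = rng.uniform(r_low, r_high, size=n // 2)
    angles = rng.uniform(0.0, np.pi, size=n // 2)
    blocks = np.zeros((n, n))
    for k, (r, theta) in enumerate(zip(radii, angles)):
        c, s = r * np.cos(theta), r * np.sin(theta)
        # Each 2 x 2 scaled rotation block has eigenvalues r * exp(+/- i * theta).
        blocks[2 * k:2 * k + 2, 2 * k:2 * k + 2] = [[c, -s], [s, c]]
    # An orthogonal change of basis preserves eigenvalues but mixes coordinates.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q @ blocks @ q.T


def eigenloss(K, max_modulus=1.0):
    """Hypothetical penalty: squared excess of eigenvalue moduli over max_modulus."""
    moduli = np.abs(np.linalg.eigvals(K))
    return float(np.sum(np.maximum(moduli - max_modulus, 0.0) ** 2))


rng = np.random.default_rng(0)
K = eigeninit(6, rng)
print(np.abs(np.linalg.eigvals(K)))  # moduli lie between 0.9 and 1.0
print(eigenloss(K), eigenloss(2.0 * np.eye(2)))
```

This version only evaluates the penalty. Using it as a training loss would require an eigendecomposition that supports gradients in the chosen deep learning framework.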
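Soft robotic grippers facilitate contact-rich manipulation, including robust grasping of varied objects. Yet the beneficial compliance of a soft gripper also results in significant deformation that can make precision manipulation challenging. We present visual pressure estimation and control (VPEC), a method that infers the pressure applied by a soft gripper using an RGB image from an external camera. We provide results for visual pressure inference when a pneumatic gripper and a tendon-actuated gripper make contact with a flat surface. We also show that VPEC enables precision manipulation via closed-loop control of inferred pressure images. In our evaluation, a mobile manipulator (Stretch RE1 from Hello Robot) uses visual servoing to make contact at a desired pressure; follow a spatial pressure trajectory; and grasp small low-profile objects, including a microSD card, a penny, and a pill. Overall, our results show that visual estimates of applied pressure can enable a soft gripper to perform precision manipulation.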
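People often interact with their surroundings by applying pressure with their hands. While hand pressure can be measured by placing pressure sensors between the hand and the environment, doing so can alter contact mechanics, interfere with human tactile perception, require costly sensors, and scale poorly to large environments. We explore the possibility of using a conventional RGB camera to infer hand pressure, enabling machine perception of hand pressure from uninstrumented hands and surfaces. The central insight is that the application of pressure by a hand results in informative appearance changes. Hands share biomechanical properties that result in similar observable phenomena, such as soft-tissue deformation, blood distribution, hand pose, and cast shadows. We collected videos of 36 participants with diverse skin tones applying pressure to an instrumented planar surface. We then trained a deep model (PressureVisionNet) to infer a pressure image from a single RGB image. Our model infers pressure for participants outside the training data and outperforms baselines. We also show that the output of our model depends on the appearance of the hand and cast shadows near contact regions. Overall, our results suggest that the appearance of a previously unobserved human hand can be used to accurately infer applied pressure. Data, code, and models are available online.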
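While robots present an opportunity to provide physical assistance to older adults and people with mobility impairments in bed, people frequently rest in bed with blankets that cover most of their body. To provide assistance for many daily self-care tasks, such as bathing, dressing, or ambulating, a caregiver must first uncover blankets from part of a person's body. In this work, we introduce a formulation for robotic bedding manipulation in which a robot uncovers a blanket from a target body part while ensuring the rest of the human body remains covered. We compare two approaches for optimizing policies that provide a robot with grasp and release points that uncover a target part of the body: 1) reinforcement learning and 2) self-supervised learning with optimization to generate training data. We trained and evaluated these policies in physics simulation environments that consist of a deformable cloth mesh covering a simulated human lying supine on a bed. In addition, we transfer simulation-trained policies to a real mobile manipulator and demonstrate that it can uncover a blanket from target body parts of a manikin lying in bed. Source code is available online.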
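In this paper, we focus on analyzing the thermal modality of tactile sensing for material recognition using a large materials database. Many factors affect thermal recognition performance, including sensor noise, the initial temperatures of the sensor and the object, the thermal effusivities of the materials, and the duration of contact. To analyze the influence of these factors on thermal recognition, we used a thermal model based on semi-infinite solids to simulate heat-transfer data from all the materials in the CES Edupack Level-1 database. We used support vector machines (SVMs) to predict F1 scores for binary material recognition for 2346 material pairs. We also collected data using a real robot equipped with a thermal sensor and analyzed its material recognition performance on 66 real-world material pairs. Additionally, we analyzed the performance when the models were trained on the simulated data and tested on the real-robot data. Our models predicted the material recognition performance with a 0.980 F1 score for the simulated data, a 0.994 F1 score for real-world data with constant initial sensor temperatures, a 0.966 F1 score for real-world data with varied initial sensor temperatures, and a 0.815 F1 score for sim-to-real transfer. Finally, we present some guidelines on sensor design and parameter choice based on insights gained from these results. We release our simulated and real-robot datasets for further use by the robotics community.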
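To see why effusivity and initial temperatures matter, consider the textbook result for two semi-infinite solids brought into perfect contact: the interface settles at an effusivity-weighted average of the two initial temperatures, with effusivity e = sqrt(k * rho * c_p). The sketch below computes that interface temperature. It is a simplified illustration, not the simulation used in the paper. The sensor effusivity and the material properties are rough illustrative values, not entries from the CES Edupack database.

```python
from math import sqrt


def effusivity(k, rho, cp):
    """Thermal effusivity sqrt(k * rho * cp), with SI inputs (W/m/K, kg/m^3, J/kg/K)."""
    return sqrt(k * rho * cp)


def contact_temperature(e_sensor, t_sensor, e_object, t_object):
    """Interface temperature of two semi-infinite solids in perfect contact."""
    return (e_sensor * t_sensor + e_object * t_object) / (e_sensor + e_object)


# Rough illustrative properties (assumed values, not from the paper's database).
e_aluminium = effusivity(k=237.0, rho=2700.0, cp=900.0)
e_wood = effusivity(k=0.12, rho=500.0, cp=1600.0)

E_SENSOR = 1000.0  # hypothetical sensor effusivity
T_SENSOR = 35.0    # heated sensor, degrees Celsius (assumed)
T_OBJECT = 25.0    # object at room temperature, degrees Celsius (assumed)

t_al = contact_temperature(E_SENSOR, T_SENSOR, e_aluminium, T_OBJECT)
t_wood = contact_temperature(E_SENSOR, T_SENSOR, e_wood, T_OBJECT)
print(round(t_al, 1), round(t_wood, 1))
```

Under these assumptions the high-effusivity material pulls the interface temperature close to the object temperature, while the low-effusivity material leaves it near the sensor temperature. If sensor and object start at the same temperature, the interface temperature is identical for every material, which is one way the initial temperatures affect recognition.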
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to read out information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
The purpose of this work was to tackle practical issues which arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot which overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input which resolves the redundancy in the kinematics and a physical interpretation of this redundancy are provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased with proximal section angle under all testing conditions, with average error reductions of 41.48% for re-tension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay, and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.
Coronary Computed Tomography Angiography (CCTA) provides information on the presence, extent, and severity of obstructive coronary artery disease. Large-scale clinical studies analyzing CCTA-derived metrics typically require ground-truth validation in the form of high-fidelity 3D intravascular imaging. However, manual rigid alignment of intravascular images to corresponding CCTA images is both time-consuming and user-dependent. Moreover, intravascular modalities suffer from several non-rigid motion-induced distortions arising from distortions in the imaging catheter path. To address these issues, we here present a semi-automatic segmentation-based framework for both rigid and non-rigid matching of intravascular images to CCTA images. We formulate the problem in terms of finding the optimal virtual catheter path that samples the CCTA data to recapitulate the coronary artery morphology found in the intravascular image. We validate our co-registration framework on a cohort of n = 40 patients using bifurcation landmarks as ground truth for longitudinal and rotational registration. Our results indicate that our non-rigid registration significantly outperforms other co-registration approaches for luminal bifurcation alignment in both longitudinal (mean mismatch: 3.3 frames) and rotational directions (mean mismatch: 28.6 degrees). By providing a differentiable framework for automatic multi-modal intravascular data fusion, our developed co-registration modules significantly reduce the manual effort required to conduct large-scale multi-modal clinical studies while also providing a solid foundation for the development of machine learning-based co-registration approaches.
Compliance in actuation has been exploited to generate highly dynamic maneuvers, such as throwing, that take advantage of the potential energy stored in joint springs. However, the energy storage and release cannot yet be well timed. Moreover, for multi-link systems, the natural system dynamics might even work against the actual goal. With the introduction of variable stiffness actuators, this problem has been partially addressed. With a suitable optimal control strategy, the approximate decoupling of the motor from the link can be achieved to maximize the energy transfer into the distal link prior to launch. However, such continuous stiffness variation is complex and typically leads to oscillatory swing-up motions instead of clear launch sequences. To circumvent this issue, we investigate decoupling for speed maximization with a dedicated novel actuator concept denoted Bi-Stiffness Actuation. With this, it is possible to fully decouple the link from the joint mechanism by a switch-and-hold clutch and simultaneously keep the elastic energy stored. We show that with this novel paradigm, it is not only possible to reach the same optimal performance as with power-equivalent variable stiffness actuation, but even to directly control the energy transfer timing. This is a major step forward compared to previous optimal control approaches, which rely on optimizing the full time-series control input.